KiaDev Intelligence

#compositional generalization

01/07/2025

OMEGA Benchmark: Testing the Creative Limits of AI in Math Reasoning

OMEGA is a novel benchmark designed to probe the reasoning limits of large language models in mathematics, focusing on exploratory, compositional, and transformational generalization.
